Note: I can’t discuss this in detail, since it’s covered by an NDA, and I haven’t seen the report that OpenPhil received, but compared to what I can see as a superforecaster on these questions, the numbers you have from GJP look wrong.
Re footnote 15, did Luisa assume that the two events were independent and that’s how she got the 0.02%? (In reality I would think that they are strongly correlated.)
The second event was elicited as a conditional probability, so the math is correct, though again it seems that the inputs are not. (The language used here doesn’t seem to note that it was conditional, so I may just be confused about what the footnote is trying to say; it reads unclearly to me. Also, the GJP report would have explicitly discussed the superforecasters’ thoughts on what might cause the question to trigger, so again, I am confused by the footnote.)
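To spell out the arithmetic (with the caveat that I haven’t checked the actual inputs): if the first number elicited is $P(A)$ and the second is the conditional $P(B \mid A)$, then multiplying them gives $P(A \cap B) = P(A)\,P(B \mid A)$ directly, with no independence assumption; independence would only be assumed if the second input were the unconditional $P(B)$.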
Davidmanheim, thanks for raising this! The GJI data should be correct now — let me know if you notice any other inconsistencies.
Thanks for this post, Luisa! Really nice resource, and I wish I had caught it earlier. A couple of methodology questions:
Why did you choose an arithmetic mean for aggregating these estimates? It seems like there is an argument to be made that in this case we care about order-of-magnitude correctness, which would imply taking the average of the log probabilities. This is equivalent to the geometric mean (I believe) and is recommended for Fermi estimates, e.g. [here](https://www.lesswrong.com/posts/PsEppdvgRisz5xAHG/fermi-estimates). (A quick numeric sketch follows after these questions.)
Do you have a sense of how much, if at all, these estimates are confounded by time? Are all of them trying to guess the likelihood of war in the few years following the estimate, or do some have longer time horizons? (You mention this explicitly for a number of them, but I’m struggling to find it for all of them; sorry if I missed it.) If these are forecasting something close to the instantaneous yearly probability, do you think we should worry about adjusting estimates by when they were made, given that, e.g., a lot has changed between 2005 and now?
Related to the above, do you believe the risk of nuclear war is changing over time or approximately constant?
Did you consider any alternatives to weighting these estimates equally? I notice, for example, that the GJI estimate on US-Russia nuclear war is more than an order of magnitude lower than the rest, but GJI is also the group I’d put my money on based on forecasting track record. Do you find these estimates approximately equally credible?
Curious for your thoughts!
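To make the first question concrete, here is a minimal sketch with made-up probabilities (not the actual figures from the post) showing how far the two aggregation methods can diverge:

```python
import numpy as np

# Hypothetical annual-probability estimates, purely for illustration
estimates = np.array([0.0001, 0.001, 0.004, 0.01, 0.02])

arithmetic = estimates.mean()
geometric = np.exp(np.log(estimates).mean())  # average of the log probabilities

print(f"arithmetic mean: {arithmetic:.3%}")  # ~0.70%, dominated by the largest estimates
print(f"geometric mean:  {geometric:.3%}")   # ~0.24%, closer to the typical order of magnitude
```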
This is a good point.
I’d add that, as a general rule, when aggregating binary predictions one should default to the average log odds, perhaps with an extremization factor as described in Satopää et al. (2014).
The reasons are that (a) empirically, it seems to work better; (b) the way Bayes’ rule works strongly suggests that log odds are the natural unit of evidence; and (c) apparently there are some deeper theoretical reasons (“external Bayesianism”) why this is better (the details go a bit over my head).
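As a rough sketch of what averaging log odds (with optional extremization) might look like in code — the forecasts and the extremization factor below are made up for illustration, not taken from the post:

```python
import numpy as np

def pool_log_odds(probs, extremization=1.0):
    """Aggregate binary-event forecasts by averaging their log odds.

    An extremization factor > 1 pushes the pooled forecast away from 50%,
    in the spirit of Satopää et al. (2014).
    """
    probs = np.asarray(probs, dtype=float)
    log_odds = np.log(probs / (1.0 - probs))
    pooled = extremization * log_odds.mean()
    return 1.0 / (1.0 + np.exp(-pooled))  # convert back to a probability

forecasts = [0.001, 0.005, 0.01, 0.02]              # hypothetical forecasts
print(pool_log_odds(forecasts))                      # ~0.56%, vs. a 0.9% arithmetic mean
print(pool_log_odds(forecasts, extremization=1.5))   # extremized: ~0.04%
```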
FYI, this post by Jaime has an extended discussion of this issue.
Very minor: “GJP” should be “GJI.” The Good Judgment Project ended with the end of the IARPA ACE tournaments. The company that pays superforecasters from that project to continue making forecasts for clients is Good Judgment Inc.
Thanks for flagging. Edited!
Another estimate of ~1%/year for US-Russia nuclear war is:
M.E. Hellman, Risk Analysis of Nuclear Deterrence, The Bent of Tau Beta Pi, 2008.
Note that when I plug this distribution into Guesstimate, I get ~0.4% median and ~1.7% mean.
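For anyone who wants to sanity-check that outside Guesstimate, here is a rough Monte Carlo sketch. The lognormal shape and the 90% interval of 0.02%–7%/year are only my assumptions for illustration, not necessarily the inputs actually used in the post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed lognormal with a 90% interval of 0.02%–7% per year (illustrative only)
lo, hi = 0.0002, 0.07
mu = (np.log(lo) + np.log(hi)) / 2               # log-space midpoint
sigma = (np.log(hi) - np.log(lo)) / (2 * 1.645)  # a 90% interval spans ±1.645 sd

samples = rng.lognormal(mean=mu, sigma=sigma, size=1_000_000)
print(f"median: {np.median(samples):.2%}, mean: {samples.mean():.2%}")
# The heavy right tail pulls the mean (~1.8%) well above the median (~0.4%),
# which matches the pattern of the Guesstimate output quoted above.
```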
Hi Luisa,
By “nuclear war”, do you mean at least one offensive nuclear detonation?
This got a nice shout-out on Marginal Revolution today.